ROC Analysis of Example Weighting in Subgroup Discovery

Authors

  • Branko Kavsek
  • Nada Lavrac
  • Ljupco Todorovski
Abstract

This paper presents two new ways of example weighting for subgroup discovery. The proposed example weighting schemes are applicable to any subgroup discovery algorithm that uses the weighted covering approach to discover interesting subgroups in data. To show the implications that the new example weighting schemes have on subgroup discovery, they were implemented in the APRIORI-SD algorithm. ROC analysis was then used to study their behavior, and the behavior of APRIORI-SD’s original example weighting scheme, both theoretically and practically, by application on the UK Traffic challenge data set. The findings show that the proposed example weighting schemes are a valid alternative to APRIORI-SD’s original example weighting scheme when the goal is to discover fewer subgroups that are either small and highly accurate or large and less accurate.
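The abstract does not spell out the weighting schemes themselves, so the following is only a minimal sketch of the general weighted covering idea, not the paper's method. It assumes the additive scheme 1/(i+1) and the multiplicative scheme gamma^i as they are commonly described in the subgroup discovery literature, where i is the number of already selected subgroups that cover an example. The rule names, the toy data, the simplified score (sum of weights of covered target-class examples), and the helper names `weighted_covering` and `roc_point` are all hypothetical.

```python
from typing import Callable, Dict, List, Set, Tuple


def additive_weight(i: int) -> float:
    """Assumed additive scheme: weight of an example already covered i times."""
    return 1.0 / (i + 1)


def multiplicative_weight(i: int, gamma: float = 0.5) -> float:
    """Assumed multiplicative scheme with decay parameter 0 < gamma < 1."""
    return gamma ** i


def roc_point(covered: Set[int], labels: List[bool]) -> Tuple[float, float]:
    """Return (FPR, TPR) of a subgroup, i.e. its position in ROC space."""
    pos = sum(labels)
    neg = len(labels) - pos
    tp = sum(1 for e in covered if labels[e])
    fp = len(covered) - tp
    return fp / neg, tp / pos


def weighted_covering(
    rules: Dict[str, Set[int]],
    labels: List[bool],
    k: int,
    weight_fn: Callable[[int], float] = additive_weight,
) -> List[str]:
    """Greedily pick up to k rules; covered examples are down-weighted, not removed."""
    counts = [0] * len(labels)  # how many selected rules cover each example
    remaining = dict(rules)
    selected: List[str] = []

    def score(name: str) -> float:
        # Simplified quality: total current weight of covered target-class examples.
        return sum(weight_fn(counts[e]) for e in remaining[name] if labels[e])

    while remaining and len(selected) < k:
        best = max(sorted(remaining), key=score)  # ties broken alphabetically
        selected.append(best)
        for e in remaining.pop(best):
            counts[e] += 1
    return selected


# Toy data: examples 0-4 belong to the target class, 5-6 do not.
labels = [True, True, True, True, True, False, False]
rules = {"A": {0, 1, 2, 5}, "B": {0, 1, 2}, "C": {3, 4, 6}}

print(weighted_covering(rules, labels, k=2))  # ['A', 'C']
print(roc_point(rules["A"], labels))          # (0.5, 0.6)
```

In this toy run, rule B overlaps rule A on every target-class example, so after A is selected its weighted score drops below that of C. Without reweighting, B would have been chosen second, which is the redundancy that weighted covering is meant to reduce.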

Similar resources

Analysis of Example Weighting in Subgroup Discovery by Comparison of Three Algorithms on a Real-life Data Set

This paper investigates the implications of example weighting in subgroup discovery by comparing three state-of-the-art subgroup discovery algorithms, APRIORI-SD, CN2-SD, and SubgroupMiner on a real-life data set. While both APRIORI-SD and CN2-SD use example weighting in the process of subgroup discovery, SubgroupMiner does not. Moreover, APRIORI-SD uses example weighting in the post-processing...

Rule induction for subgroup discovery with CN2-SD

Rule learning is typically used in solving classification and prediction tasks. However, learning of classification rules can be adapted also to subgroup discovery. This paper shows how this can be achieved by modifying the CN2 rule learning algorithm. Modifications include a new covering algorithm (weighted covering algorithm), a new search heuristic (weighted relative accuracy), probabilistic...

A new weighting approach to Non-Parametric composite indices compared with principal components analysis

The introduction of the Human Development Index (HDI) by UNDP in early 1990 was followed by a surge in the use of non-parametric and parametric indices for measuring and comparing countries' performance in development, globalization, competition, well-being, etc. The HDI is a composite index of three indicators. Its components are to reflect three major dimensions of human development: longevity, knowledg...

ROCsearch in a Wider Context — A ROC-Guided Search Strategy for Subgroup Discovery and Beyond

ROCsearch is a ROC-based beam search variant, initially developed for Subgroup Discovery (SD). In ordinary beam search, on each search level, a fixed number of best-scoring candidates are selected to generate candidates for the next search level. This fixed number, the beam width, is typically hard to set, and its setting strongly influences the outcome of the mining process. In ROCsearch, howe...

Using Subgroup Discovery to Analyze the UK Traffic Data

Rule learning is typically used in solving classification and prediction tasks. However, learning of classification rules can be adapted also to subgroup discovery. Such an adaptation has already been done for the CN2 rule learning algorithm. In previous work this new algorithm, called CN2-SD, has been described in detail and applied to the well known UCI data sets. This paper summarizes the mo...

Publication date: 2004